SPEECH

Most Macs can synthesise speech if the necessary system software is installed. A PowerMac with extra software can even respond to your spoken commands!

Synthesised Speech
==================

Synthesised speech on the Mac is called text-to-speech (TTS). Any text, such as that in a document or dialog, can be converted into speech using a software synthesiser. To use TTS you’ll need an application that works with it. If you can’t find anything else, try SimpleText!

System Software for Speech Synthesis
------------------------------------

TTS needs special software in your System Folder. The Easy Install option in the System Installer normally provides it automatically. If it doesn’t, you’ll have to use Custom Install.

The Mac Plus doesn’t support TTS, and the presence of the TTS software can cause a freeze during startup. To disable it, hold down Space during startup and switch it off in Extensions Manager.

To use TTS the following files must be in the System Folder:

In the Extensions folder:
    Speech Manager
    MacinTalk 2 and/or Macintalk 3 and/or MacIntalk Pro synthesiser
    Voices folder with voices matching the version of Macintalk

In the Control Panels folder:
    Speech control panel

Speech Synthesisers

The MacinTalk 2 and 3 speech synthesisers are included in the system software, but the Macintalk Pro synthesiser is a separate product.

MacinTalk 2 is suitable for any Mac with a 68000, 68020 or 68030 processor running at under 33 MHz. Even if you use Macintalk 3 you’ll still need MacinTalk 2 for MacinTalk 2 voices! These voices, such as Boris and RoboVox, use only a small amount of RAM.

Macintalk 3 is an improved version for 68030 Macs running at 33 MHz or more.

MacIntalk Pro is for 68040 Macs or PowerMacs only. Its voices can use up to 5 MB of RAM each. Since they’re mainly held in system memory you shouldn’t need to keep adjusting the memory assigned to a speech application, but the memory must still be available! Compressed voices require much less memory.
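As a rough summary of the processor guidance above, the choice of synthesiser can be sketched in Python. This is only an illustration: the function name and the processor strings are assumptions made for the example and are not part of any Apple software. The Mac Plus exception is not modelled.

```python
def recommended_synthesiser(cpu, mhz):
    """Return the synthesiser the text suggests for a processor.

    Hypothetical helper: 'cpu' is a string such as "68030" or "PowerPC"
    and 'mhz' is the clock speed. The names returned use the file-name
    capitalisation given in the text.
    """
    cpu = cpu.lower()
    if cpu in ("68040", "powerpc"):
        return "MacIntalk Pro"      # 68040 Macs or PowerMacs only
    if cpu == "68030" and mhz >= 33:
        return "Macintalk 3"        # 68030 running at 33 MHz or more
    if cpu in ("68000", "68020", "68030"):
        return "MacinTalk 2"        # slower 680x0 Macs
    raise ValueError("processor not covered by the text: " + cpu)


print(recommended_synthesiser("68030", 25))
print(recommended_synthesiser("PowerPC", 66))
```

Remember that MacinTalk 2 is still needed alongside a later synthesiser if you want to use MacinTalk 2 voices.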
The Gala Tea voices supplied with PlainTalk (see below) include TTS Male, Agnes, Bruce and Victoria. Their RAM requirements are similar to those of the MacinTalk Pro voices.

Note: The varied upper-case letters in these file names aren’t a mistake!

The Voices Folder

The voice files in the Voices folder can only be used by the appropriate Macintalk synthesiser. All available voices appear in the Speech control panel and in speech application menus.

Note: A voice file will only work if its matching synthesiser is available. The voice file type is shown by the number on its icon.

The Speech Control Panel

This panel is used to set the defaults for speech synthesis and recognition. You can choose a default Macintalk voice and specify its rate of delivery. Just click on the loudspeaker box for a sample!

Speech Synthesis in Applications
--------------------------------

With the appropriate system software installed you can use TTS in any application that supports it. For example, SimpleText will read out the contents of a file, and you can choose a voice from its Sound menu.

So To Speak is an application that demonstrates TTS to the full. The voices are selected from a pop-up menu. During speech the pitch and rate of delivery can be adjusted, and the speech can be paused or stopped at any point. Both So To Speak and Speaker (a neat application for reading text files) can use an external file containing a pronunciation dictionary (see below).

Voice Parameters

Some applications let you modify the parameters of a voice. An application called DictionaryEdit (see below), for example, offers a dialog with the following parameters:

    Rate         Speed of delivery, as in the Speech control panel
    Pitch        Underlying tone of voice (higher for females!)
    Modulation   Inflection, emphasising parts of words or sentences

Speech Synthesis in System Additions
------------------------------------

Special control panels or extensions can be used to add extra TTS features to your Mac.
For example, Speak2Me will read out the name of a selected icon in the Finder, whilst SpeakAlert will speak the contents of any alert box.

Pronunciation Dictionaries
--------------------------

TTS often pronounces words incorrectly, especially the names of people and places. A pronunciation or speech dictionary can be used to tell a synthesiser how to pronounce these words.

The dictionary describes the pronunciation of each word with phonemes and prosodic controls, each represented by a sequence of characters. Phonemes describe the sounds that make up each syllable, whilst prosodic controls add the stress and intonation needed for natural speech.

If a speech application finds a word in its dictionary, the synthesiser uses the dictionary’s entry instead of the standard conversion rules. A dictionary can replace a word with an alternative spoken word, but it’s more often used to improve the pronunciation.

The dictionary comes in the form of a dict resource that’s loaded by the speech application prior to use. This resource can be in the application itself or in a file of Type dict or rsrc. Most dictionaries don’t support abbreviated entries and only allow two fields per entry: the first is the text, the second is the phoneme. Neither field may contain more than 256 characters. The list inside a dictionary isn’t necessarily sorted into any particular order!

Inside a Pronunciation Dictionary

DictionaryEdit is an application for modifying dicts. Several dictionaries can be open at once, but you can only edit one dict at a time within a single file. You can cut and copy (or drag) entries between dictionaries, or fill a dictionary by converting its text into phonemes.

When you select Open Dictionary… you’re presented with a window showing the dictionary’s contents. The left-hand box contains each text entry in the dictionary. The lower box to the right contains a list of the phonemes and prosodic controls that make up the selected word, for example samuel.
When you click on the left-hand icon the selected text is automatically converted into its component parts and placed in the phoneme list. You can then edit the phonemes or add prosodic controls as necessary. As you work you can listen to the results by clicking on the right-hand icon. You can also open and edit text files to try out your new dictionaries.

Phonemes

Phonemes are components of speech that are represented by case-sensitive symbols. For example, the vowel sounds in the words bout and how are both represented by the AW phoneme, even though they’re spelt differently! The full list of phonemes is:

    AE  bat       EY  bait      AO  caught    AX  about
    IY  meet      EH  bet       IH  bit       AY  bite
    IX  closes    AA  cot       UW  boot      UH  book
    UX  mud       OW  boat      AW  bout      OY  boy

    b   bin       C   chin      d   dark      D   those
    f   fake      g   gain      h   hat       J   gin
    k   kin       l   limb      m   mat       n   knock
    N   tang      p   pin       r   ran       s   satin
    S   shin      t   tin       T   thin      v   van
    w   wet       y   yank      z   zen       Z   genre

    %   silence   @   breath intake

For example, the word application is AEplIHkEYSAXn.

Prosodic Codes

Prosodic codes (or prosody symbols) are used with phonemes to fine-tune the pronunciation. They include:

    Symbol   Meaning
    1        Primary Stress
    2        Secondary Stress
    =        Syllable Mark
    ~        Unstressed
    _        Normal Stress
    +        Emphatic Stress
    /        Pitch Rise
    \        Pitch Fall
    >        Lengthen phoneme
    <        Shorten phoneme
    .        Sentence final fall
    ?        Sentence final rise
    !        Sentence final sharp fall
    …        Clause final level
    ,        Continuation rise
    ;        Continuation rise
    :        Clause final level
    (        Start reduced range
    )        End reduced range
    “        Varies
    ‘        Varies
    ”        Varies
    ’        Varies
    -        Clause final level
    &        Forces no silence between phonemes

The stress codes (1 and 2) indicate which syllables should be emphasised. For example, the word anticipation could be in the form:

    AEnt2IHsIXp1EYSAXn

The syllable codes (=) break the word into syllables:

    AEn=t2IH=sIX=p1EY=SAXn

The word prominence codes (~, _ and +) indicate the need to stress a particular word. The true prosodic codes (/, \, > and <) can be used to modify a phoneme to make a word sound more natural.
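To see how the phoneme symbols, stress codes and syllable marks described above fit together, here is a minimal Python sketch that splits an entry into its parts. It is not how any synthesiser or DictionaryEdit actually works: the function names are hypothetical, and the rule of trying a two-letter vowel code before a one-character code is an assumption made for this example. All other prosodic codes are passed through uninterpreted.

```python
# Two-letter vowel codes and one-character consonant codes from the phoneme list.
VOWELS = {"AE", "EY", "AO", "AX", "IY", "EH", "IH", "AY",
          "IX", "AA", "UW", "UH", "UX", "OW", "AW", "OY"}
CONSONANTS = set("bCdDfghJklmnNprsStTvwyzZ")
PAUSES = {"%", "@"}          # silence and breath intake
STRESS = {"1", "2"}          # primary and secondary stress
SYLLABLE_MARK = "="


def tokenise(entry):
    """Split a phoneme string into (kind, symbol) pairs (hypothetical helper)."""
    tokens = []
    i = 0
    while i < len(entry):
        pair = entry[i:i + 2]
        ch = entry[i]
        if pair in VOWELS:                    # assumption: try two-letter codes first
            tokens.append(("phoneme", pair))
            i += 2
            continue
        if ch in CONSONANTS or ch in PAUSES:
            tokens.append(("phoneme", ch))
        elif ch in STRESS:
            tokens.append(("stress", ch))
        elif ch == SYLLABLE_MARK:
            tokens.append(("syllable", ch))
        else:
            tokens.append(("other", ch))      # remaining prosodic codes, not interpreted
        i += 1
    return tokens


def phonemes(entry):
    """Return only the phoneme symbols of an entry, in order."""
    return [symbol for kind, symbol in tokenise(entry) if kind == "phoneme"]


def syllables(entry):
    """Split an entry at its syllable marks."""
    return entry.split(SYLLABLE_MARK)


print(phonemes("AEplIHkEYSAXn"))
# ['AE', 'p', 'l', 'IH', 'k', 'EY', 'S', 'AX', 'n']
print(syllables("AEn=t2IH=sIX=p1EY=SAXn"))
# ['AEn', 't2IH', 'sIX', 'p1EY', 'SAXn']
```

Because the symbols are case-sensitive, S (shin) and s (satin) or T (thin) and t (tin) are treated as different phonemes.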
The Dictionary Header

Some applications let you edit the dictionary’s header, although most users won’t need to modify it. In DictionaryEdit the header window shows the following parameters:

    Parameter            Usual contents
    Atom type            This is dict for a standard dictionary
    Format version       This is 1 for a pronunciation dictionary
    Script               For the Roman system this is 0
    Language             Code for English is 0
    Region               Code for the USA is 0; for the UK it’s 2
    Date Last Modified   Last date the dictionary was modified
    Dict size            Total byte length of the dictionary

The Dict size indicates how much memory the dictionary will use when loaded.

Speech Recognition
==================

The PlainTalk software package lets you speak instructions to your Mac, but only if it’s an AV Mac or PowerMac. If you install MacIntalk Pro as well it will reply in good voice!

Note: For full details see the information supplied with the PlainTalk package.

To use speech recognition you’ll need a PlainTalk microphone. The circular microphone supplied with many Macs isn’t suitable, but the one built into an AV monitor is! Many Macs can’t accept a PlainTalk microphone!

See the Sound chapter for more about microphones and sound inputs.

You’ll also need the Speakable Items software, the Speech control panel and the Speech Recognition extension. You should also install the software for TTS! Other necessary extension files include:

    Extension             Function
    Speech Macro Editor   Defines Mac instructions
    My Speech Macros      Contains your instructions
    SR Monitor            Monitors and interprets speech
    System Speech Rules   For different voices and speech dialects

With recognition enabled, a list of spoken commands appears in the Speakable Items folder in the Apple menu. The Listening option in the Speech control panel lets you pick a key combination that switches the Mac into listening mode, in which it accepts commands.

©Ray White. All Rights Reserved 1997